Overview

Dataset statistics

Number of variables: 18
Number of observations: 844392
Missing cells: 1809345
Missing cells (%): 11.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 122.4 MiB
Average record size in memory: 152.0 B
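
The missing-cell totals can be cross-checked against the per-variable missing counts reported in the variable sections of this document. The following is a minimal sketch in plain Python, written for illustration and not part of the profiling output; the variable names are chosen here, and the counts are copied from the report.

```python
# Missing-value counts per variable, copied from this report.
missing_per_variable = {
    "CompetitionDistance": 2186,
    "CompetitionOpenSinceMonth": 268619,
    "CompetitionOpenSinceYear": 268619,
    "Promo2SinceWeek": 423307,
    "Promo2SinceYear": 423307,
    "PromoInterval": 423307,
}
n_variables = 18
n_observations = 844392

# A "cell" is one variable of one observation.
total_cells = n_variables * n_observations
missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")
```

The sum of the six per-variable counts equals the reported number of missing cells, so no other variable contributes missing values.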

Variable types

Numeric: 9
DateTime: 1
Categorical: 8

Alerts

Open has constant value "1" [Constant]
Sales is highly overall correlated with Customers [High correlation]
Customers is highly overall correlated with Sales [High correlation]
Promo2SinceWeek is highly overall correlated with Promo2 and 1 other field [High correlation]
Promo2SinceYear is highly overall correlated with Promo2 [High correlation]
StoreType is highly overall correlated with Assortment [High correlation]
Assortment is highly overall correlated with StoreType [High correlation]
Promo2 is highly overall correlated with Promo2SinceWeek and 2 other fields [High correlation]
PromoInterval is highly overall correlated with Promo2SinceWeek and 1 other field [High correlation]
StateHoliday is highly imbalanced (99.3%) [Imbalance]
CompetitionOpenSinceMonth has 268619 (31.8%) missing values [Missing]
CompetitionOpenSinceYear has 268619 (31.8%) missing values [Missing]
Promo2SinceWeek has 423307 (50.1%) missing values [Missing]
Promo2SinceYear has 423307 (50.1%) missing values [Missing]
PromoInterval has 423307 (50.1%) missing values [Missing]

Reproduction

Analysis started: 2023-10-16 12:17:17.358081
Analysis finished: 2023-10-16 12:18:51.927215
Duration: 1 minute and 34.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
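
The reported duration follows directly from the two timestamps. The following is a small sketch using only the Python standard library; it is an illustration added here, not something the profiler emits.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-10-16 12:17:17.358081")
finished = datetime.fromisoformat("2023-10-16 12:18:51.927215")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```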

Variables

Store
Real number (ℝ)

Distinct: 1115
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 558.42292
Minimum: 1
Maximum: 1115
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 56
Q1: 280
Median: 558
Q3: 837
95-th percentile: 1060
Maximum: 1115
Range: 1114
Interquartile range (IQR): 557

Descriptive statistics

Standard deviation: 321.73191
Coefficient of variation (CV): 0.57614382
Kurtosis: -1.1988414
Mean: 558.42292
Median Absolute Deviation (MAD): 278
Skewness: 0.00041375446
Sum: 4.7152785 × 10^8
Variance: 103511.42
Monotonicity: Not monotonic
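
Several of the figures for Store are derived from others in the same block. The following sketch is plain Python written for illustration, with names chosen here. It recomputes the range, the IQR, the variance, and the coefficient of variation from the reported minimum, maximum, quartiles, mean, and standard deviation. Small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the Store section of this report.
minimum, q1, q3, maximum = 1, 280, 837, 1115
mean, std = 558.42292, 321.73191

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
variance = std ** 2               # Variance is the squared standard deviation
cv = std / mean                   # Coefficient of variation

print(value_range, iqr, round(variance, 2), round(cv, 6))
```

The same relationships hold for every numeric variable in this report, so they are a quick way to spot transcription errors.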
Histogram with fixed size bins (bins=50)
Common Values

Value                Count   Frequency (%)
562                  942     0.1%
769                  942     0.1%
733                  942     0.1%
423                  942     0.1%
85                   942     0.1%
262                  942     0.1%
335                  942     0.1%
682                  942     0.1%
1097                 942     0.1%
494                  942     0.1%
Other values (1105)  834972  98.9%

Extreme Values (smallest)

Value  Count  Frequency (%)
1      781    0.1%
2      784    0.1%
3      779    0.1%
4      784    0.1%
5      779    0.1%
6      780    0.1%
7      786    0.1%
8      784    0.1%
9      779    0.1%
10     784    0.1%

Extreme Values (largest)

Value  Count  Frequency (%)
1115   781    0.1%
1114   784    0.1%
1113   784    0.1%
1112   779    0.1%
1111   779    0.1%
1110   783    0.1%
1109   622    0.1%
1108   780    0.1%
1107   623    0.1%
1106   784    0.1%

DayOfWeek
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.5203614
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 6
Maximum: 7
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.7236892
Coefficient of variation (CV): 0.48963417
Kurtosis: -1.2593101
Mean: 3.5203614
Median Absolute Deviation (MAD): 2
Skewness: 0.01929954
Sum: 2972565
Variance: 2.9711045
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common Values

Value  Count   Frequency (%)
6      144058  17.1%
2      143961  17.0%
3      141936  16.8%
5      138640  16.4%
1      137560  16.3%
4      134644  15.9%
7      3593    0.4%

Extreme Values (smallest)

Value  Count   Frequency (%)
1      137560  16.3%
2      143961  17.0%
3      141936  16.8%
4      134644  15.9%
5      138640  16.4%
6      144058  17.1%
7      3593    0.4%

Extreme Values (largest)

Value  Count   Frequency (%)
7      3593    0.4%
6      144058  17.1%
5      138640  16.4%
4      134644  15.9%
3      141936  16.8%
2      143961  17.0%
1      137560  16.3%
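
Because DayOfWeek has only seven distinct values, its summary statistics can be recomputed from the frequency table alone. The following sketch is plain Python written for illustration; the helper name `median_from_counts` is hypothetical. It assumes the reported variance uses the sample convention with an n - 1 denominator. That is an assumption, but it reproduces the reported value.

```python
# Frequency table for DayOfWeek, copied from this report: value -> count.
day_counts = {1: 137560, 2: 143961, 3: 141936, 4: 134644,
              5: 138640, 6: 144058, 7: 3593}

n = sum(day_counts.values())
total = sum(value * count for value, count in day_counts.items())
mean = total / n

# Sample variance (n - 1 denominator); assumed convention, see lead-in.
squared_dev = sum(count * (value - mean) ** 2
                  for value, count in day_counts.items())
variance = squared_dev / (n - 1)


def median_from_counts(counts):
    """Return the smallest value whose cumulative count reaches half the rows.

    This is the lower median; it equals the usual median whenever both
    middle rows carry the same value, as they do here.
    """
    half = sum(counts.values()) / 2
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= half:
            return value
    raise ValueError("empty frequency table")


median = median_from_counts(day_counts)
print(n, total, round(mean, 7), round(variance, 7), median)
```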

Date
Date

Distinct: 942
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.9 MiB
Minimum: 2013-01-01 00:00:00
Maximum: 2015-07-31 00:00:00
Histogram with fixed size bins (bins=50)

Sales
Real number (ℝ)

HIGH CORRELATION 

Distinct: 21734
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6955.5143
Minimum: 0
Maximum: 41551
Zeros: 54
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3173
Q1: 4859
Median: 6369
Q3: 8360
95-th percentile: 12668
Maximum: 41551
Range: 41551
Interquartile range (IQR): 3501

Descriptive statistics

Standard deviation: 3104.2147
Coefficient of variation (CV): 0.44629549
Kurtosis: 4.8520115
Mean: 6955.5143
Median Absolute Deviation (MAD): 1694
Skewness: 1.593922
Sum: 5.8731806 × 10^9
Variance: 9636148.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value                 Count   Frequency (%)
5674                  215     < 0.1%
5558                  197     < 0.1%
5483                  196     < 0.1%
6049                  195     < 0.1%
6214                  195     < 0.1%
5723                  194     < 0.1%
5449                  192     < 0.1%
5489                  191     < 0.1%
5140                  191     < 0.1%
5041                  190     < 0.1%
Other values (21724)  842436  99.8%

Extreme Values (smallest)

Value  Count  Frequency (%)
0      54     < 0.1%
46     1      < 0.1%
124    1      < 0.1%
133    1      < 0.1%
286    1      < 0.1%
297    1      < 0.1%
316    1      < 0.1%
416    1      < 0.1%
506    1      < 0.1%
520    1      < 0.1%

Extreme Values (largest)

Value  Count  Frequency (%)
41551  1      < 0.1%
38722  1      < 0.1%
38484  1      < 0.1%
38367  1      < 0.1%
38037  1      < 0.1%
38025  1      < 0.1%
37646  1      < 0.1%
37403  1      < 0.1%
37376  1      < 0.1%
37122  1      < 0.1%

Customers
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4086
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 762.7284
Minimum: 0
Maximum: 7388
Zeros: 52
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 349
Q1: 519
Median: 676
Q3: 893
95-th percentile: 1440
Maximum: 7388
Range: 7388
Interquartile range (IQR): 374

Descriptive statistics

Standard deviation: 401.22767
Coefficient of variation (CV): 0.52604266
Kurtosis: 13.313755
Mean: 762.7284
Median Absolute Deviation (MAD): 179
Skewness: 2.7881104
Sum: 6.4404176 × 10^8
Variance: 160983.65
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value                Count   Frequency (%)
560                  2414    0.3%
576                  2363    0.3%
603                  2337    0.3%
571                  2330    0.3%
555                  2328    0.3%
566                  2327    0.3%
517                  2326    0.3%
539                  2309    0.3%
651                  2299    0.3%
533                  2298    0.3%
Other values (4076)  821061  97.2%

Extreme Values (smallest)

Value  Count  Frequency (%)
0      52     < 0.1%
3      1      < 0.1%
5      1      < 0.1%
8      1      < 0.1%
13     1      < 0.1%
18     1      < 0.1%
36     1      < 0.1%
40     1      < 0.1%
44     1      < 0.1%
50     1      < 0.1%

Extreme Values (largest)

Value  Count  Frequency (%)
7388   1      < 0.1%
5494   1      < 0.1%
5458   1      < 0.1%
5387   1      < 0.1%
5297   1      < 0.1%
5192   1      < 0.1%
5152   1      < 0.1%
5145   1      < 0.1%
5132   1      < 0.1%
5112   1      < 0.1%

Open
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.9 MiB

Value  Count
1      844392

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 844392
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count   Frequency (%)
1      844392  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1      844392  100.0%

Most occurring characters

Value  Count   Frequency (%)
1      844392  100.0%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  844392  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      844392  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  844392  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      844392  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  844392  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      844392  100.0%

Promo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.9 MiB

Value  Count
0      467496
1      376896

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 844392
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count   Frequency (%)
0      467496  55.4%
1      376896  44.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      467496  55.4%
1      376896  44.6%

Most occurring characters

Value  Count   Frequency (%)
0      467496  55.4%
1      376896  44.6%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  844392  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      467496  55.4%
1      376896  44.6%

Most occurring scripts

Value   Count   Frequency (%)
Common  844392  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      467496  55.4%
1      376896  44.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  844392  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      467496  55.4%
1      376896  44.6%

StateHoliday
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.9 MiB

Value  Count
0      843482
a      694
b      145
c      71

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 844392
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0
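
The report flags StateHoliday as 99.3% imbalanced but does not state how that score is defined. One definition that reproduces the figure is one minus the Shannon entropy of the class distribution, normalised by the maximum entropy for the same number of classes. The sketch below is plain Python written for illustration and treats that definition as an assumption; the function name `imbalance_score` is hypothetical and is not taken from the profiler's API.

```python
import math

# StateHoliday value counts, copied from this report.
state_holiday_counts = {"0": 843482, "a": 694, "b": 145, "c": 71}


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean that a
    single class dominates. (Assumed definition, see lead-in.)
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in counts.values() if c > 0)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(state_holiday_counts)
print(f"{100 * score:.1f}%")
```

Under this definition the score is driven by the 843482 rows with value 0; the three holiday classes together account for only 910 rows.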

Common Values

Value  Count   Frequency (%)
0      843482  99.9%
a      694     0.1%
b      145     < 0.1%
c      71      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      843482  99.9%
a      694     0.1%
b      145     < 0.1%
c      71      < 0.1%

Most occurring characters

Value  Count   Frequency (%)
0      843482  99.9%
a      694     0.1%
b      145     < 0.1%
c      71      < 0.1%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    843482  99.9%
Lowercase Letter  910     0.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
a      694    76.3%
b      145    15.9%
c      71     7.8%

Decimal Number
Value  Count   Frequency (%)
0      843482  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  843482  99.9%
Latin   910     0.1%

Most frequent character per script

Latin
Value  Count  Frequency (%)
a      694    76.3%
b      145    15.9%
c      71     7.8%

Common
Value  Count   Frequency (%)
0      843482  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  844392  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      843482  99.9%
a      694     0.1%
b      145     < 0.1%
c      71      < 0.1%

SchoolHoliday
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.9 MiB

Value  Count
0      680935
1      163457

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 844392
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count   Frequency (%)
0      680935  80.6%
1      163457  19.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      680935  80.6%
1      163457  19.4%

Most occurring characters

Value  Count   Frequency (%)
0      680935  80.6%
1      163457  19.4%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  844392  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      680935  80.6%
1      163457  19.4%

Most occurring scripts

Value   Count   Frequency (%)
Common  844392  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      680935  80.6%
1      163457  19.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  844392  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      680935  80.6%
1      163457  19.4%

StoreType
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.9 MiB

Value  Count
a      457077
d      258774
c      112978
b      15563

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 844392
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c
2nd row: a
3rd row: a
4th row: c
5th row: a

Common Values

Value  Count   Frequency (%)
a      457077  54.1%
d      258774  30.6%
c      112978  13.4%
b      15563   1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
a      457077  54.1%
d      258774  30.6%
c      112978  13.4%
b      15563   1.8%

Most occurring characters

Value  Count   Frequency (%)
a      457077  54.1%
d      258774  30.6%
c      112978  13.4%
b      15563   1.8%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  844392  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
a      457077  54.1%
d      258774  30.6%
c      112978  13.4%
b      15563   1.8%

Most occurring scripts

Value  Count   Frequency (%)
Latin  844392  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
a      457077  54.1%
d      258774  30.6%
c      112978  13.4%
b      15563   1.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  844392  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
a      457077  54.1%
d      258774  30.6%
c      112978  13.4%
b      15563   1.8%

Assortment
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.9 MiB

Value  Count
a      444909
c      391271
b      8212

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 844392
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a
2nd row: a
3rd row: a
4th row: c
5th row: a

Common Values

Value  Count   Frequency (%)
a      444909  52.7%
c      391271  46.3%
b      8212    1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
a      444909  52.7%
c      391271  46.3%
b      8212    1.0%

Most occurring characters

Value  Count   Frequency (%)
a      444909  52.7%
c      391271  46.3%
b      8212    1.0%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  844392  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
a      444909  52.7%
c      391271  46.3%
b      8212    1.0%

Most occurring scripts

Value  Count   Frequency (%)
Latin  844392  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
a      444909  52.7%
c      391271  46.3%
b      8212    1.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  844392  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
a      444909  52.7%
c      391271  46.3%
b      8212    1.0%

CompetitionDistance
Real number (ℝ)

Distinct: 654
Distinct (%): 0.1%
Missing: 2186
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 5457.9796
Minimum: 20
Maximum: 75860
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 20
5-th percentile: 130
Q1: 710
Median: 2320
Q3: 6890
95-th percentile: 20390
Maximum: 75860
Range: 75840
Interquartile range (IQR): 6180

Descriptive statistics

Standard deviation: 7809.4373
Coefficient of variation (CV): 1.4308293
Kurtosis: 13.413381
Mean: 5457.9796
Median Absolute Deviation (MAD): 1970
Skewness: 2.9751106
Sum: 4.5967432 × 10^9
Variance: 60987311
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value               Count   Frequency (%)
250                 9210    1.1%
50                  6249    0.7%
350                 6239    0.7%
1200                6072    0.7%
190                 6066    0.7%
90                  5609    0.7%
180                 5422    0.6%
150                 5294    0.6%
330                 5294    0.6%
140                 4684    0.6%
Other values (644)  782067  92.6%

Extreme Values (smallest)

Value  Count  Frequency (%)
20     779    0.1%
30     3116   0.4%
40     3890   0.5%
50     6249   0.7%
60     2342   0.3%
70     3736   0.4%
80     2331   0.3%
90     5609   0.7%
100    3900   0.5%
110    4516   0.5%

Extreme Values (largest)

Value  Count  Frequency (%)
75860  887    0.1%
58260  885    0.1%
48330  784    0.1%
46590  784    0.1%
45740  780    0.1%
44320  780    0.1%
40860  881    0.1%
40540  780    0.1%
38710  784    0.1%
38630  882    0.1%

CompetitionOpenSinceMonth
Real number (ℝ)

MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 268619
Missing (%): 31.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.2248786
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
Median: 8
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.2101438
Coefficient of variation (CV): 0.44431803
Kurtosis: -1.247663
Mean: 7.2248786
Median Absolute Deviation (MAD): 3
Skewness: -0.17157646
Sum: 4159890
Variance: 10.305023
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Common Values

Value             Count   Frequency (%)
9                 95467   11.3%
4                 72256   8.6%
11                70032   8.3%
3                 52685   6.2%
7                 49009   5.8%
12                47887   5.7%
10                46198   5.5%
6                 37759   4.5%
5                 32862   3.9%
2                 31360   3.7%
Other values (2)  40258   4.8%
(Missing)         268619  31.8%

Extreme Values (smallest)

Value  Count  Frequency (%)
1      10297  1.2%
2      31360  3.7%
3      52685  6.2%
4      72256  8.6%
5      32862  3.9%
6      37759  4.5%
7      49009  5.8%
8      29961  3.5%
9      95467  11.3%
10     46198  5.5%

Extreme Values (largest)

Value  Count  Frequency (%)
12     47887  5.7%
11     70032  8.3%
10     46198  5.5%
9      95467  11.3%
8      29961  3.5%
7      49009  5.8%
6      37759  4.5%
5      32862  3.9%
4      72256  8.6%
3      52685  6.2%

CompetitionOpenSinceYear
Real number (ℝ)

MISSING 

Distinct: 23
Distinct (%): < 0.1%
Missing: 268619
Missing (%): 31.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2008.6977
Minimum: 1900
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 1900
5-th percentile: 2001
Q1: 2006
Median: 2010
Q3: 2013
95-th percentile: 2015
Maximum: 2015
Range: 115
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.9780481
Coefficient of variation (CV): 0.0029760815
Kurtosis: 121.84638
Mean: 2008.6977
Median Absolute Deviation (MAD): 3
Skewness: -7.5221054
Sum: 1.1565539 × 10^9
Variance: 35.73706
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Common Values

Value              Count   Frequency (%)
2013               63108   7.5%
2012               61719   7.3%
2014               52815   6.3%
2005               46705   5.5%
2010               42716   5.1%
2011               41366   4.9%
2009               40713   4.8%
2008               40198   4.8%
2007               36131   4.3%
2006               35543   4.2%
Other values (13)  114759  13.6%
(Missing)          268619  31.8%

Extreme Values (smallest)

Value  Count  Frequency (%)
1900   622    0.1%
1961   779    0.1%
1990   3887   0.5%
1994   1552   0.2%
1995   1404   0.2%
1998   766    0.1%
1999   6213   0.7%
2000   7631   0.9%
2001   12157  1.4%
2002   20736  2.5%

Extreme Values (largest)

Value  Count  Frequency (%)
2015   28844  3.4%
2014   52815  6.3%
2013   63108  7.5%
2012   61719  7.3%
2011   41366  4.9%
2010   42716  5.1%
2009   40713  4.8%
2008   40198  4.8%
2007   36131  4.3%
2006   35543  4.2%

Promo2
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.9 MiB

Value  Count
0      423307
1      421085

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 844392
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      423307  50.1%
1      421085  49.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      423307  50.1%
1      421085  49.9%

Most occurring characters

Value  Count   Frequency (%)
0      423307  50.1%
1      421085  49.9%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  844392  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      423307  50.1%
1      421085  49.9%

Most occurring scripts

Value   Count   Frequency (%)
Common  844392  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      423307  50.1%
1      421085  49.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  844392  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      423307  50.1%
1      421085  49.9%

Promo2SinceWeek
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 24
Distinct (%): < 0.1%
Missing: 423307
Missing (%): 50.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.253426
Minimum: 1
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 13
Median: 22
Q3: 37
95-th percentile: 45
Maximum: 50
Range: 49
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.100569
Coefficient of variation (CV): 0.60638671
Kurtosis: -1.3690876
Mean: 23.253426
Median Absolute Deviation (MAD): 13
Skewness: 0.10641224
Sum: 9791669
Variance: 198.82603
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Common Values

Value              Count   Frequency (%)
14                 60541   7.2%
40                 51507   6.1%
31                 33238   3.9%
10                 32214   3.8%
5                  29722   3.5%
37                 27116   3.2%
1                  26873   3.2%
13                 24579   2.9%
45                 24072   2.9%
22                 23645   2.8%
Other values (14)  87578   10.4%
(Missing)          423307  50.1%

Extreme Values (smallest)

Value  Count  Frequency (%)
1      26873  3.2%
5      29722  3.5%
6      771    0.1%
9      10293  1.2%
10     32214  3.8%
13     24579  2.9%
14     60541  7.2%
18     22456  2.7%
22     23645  2.8%
23     3558   0.4%

Extreme Values (largest)

Value  Count  Frequency (%)
50     780    0.1%
49     622    0.1%
48     7033   0.8%
45     24072  2.9%
44     2182   0.3%
40     51507  6.1%
39     3889   0.5%
37     27116  3.2%
36     7620   0.9%
35     18888  2.2%

Promo2SinceYear
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 7
Distinct (%): < 0.1%
Missing: 423307
Missing (%): 50.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2011.754
Minimum: 2009
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.9 MiB

Quantile statistics

Minimum: 2009
5-th percentile: 2009
Q1: 2011
Median: 2012
Q3: 2013
95-th percentile: 2014
Maximum: 2015
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.6609621
Coefficient of variation (CV): 0.00082562883
Kurtosis: -1.03757
Mean: 2011.754
Median Absolute Deviation (MAD): 1
Skewness: -0.12274344
Sum: 8.4711944 × 10^8
Variance: 2.7587952
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common Values

Value      Count   Frequency (%)
2011       95040   11.3%
2013       91866   10.9%
2014       65768   7.8%
2012       60716   7.2%
2009       53826   6.4%
2010       46414   5.5%
2015       7455    0.9%
(Missing)  423307  50.1%

Extreme Values (smallest)

Value  Count  Frequency (%)
2009   53826  6.4%
2010   46414  5.5%
2011   95040  11.3%
2012   60716  7.2%
2013   91866  10.9%
2014   65768  7.8%
2015   7455   0.9%

Extreme Values (largest)

Value  Count  Frequency (%)
2015   7455   0.9%
2014   65768  7.8%
2013   91866  10.9%
2012   60716  7.2%
2011   95040  11.3%
2010   46414  5.5%
2009   53826  6.4%

PromoInterval
Categorical

HIGH CORRELATION  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 423307
Missing (%): 50.1%
Memory size: 12.9 MiB

Value             Count
Jan,Apr,Jul,Oct   242411
Feb,May,Aug,Nov   98005
Mar,Jun,Sept,Dec  80669

Length

Max length: 16
Median length: 15
Mean length: 15.191574
Min length: 15

Characters and Unicode

Total characters: 6396944
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
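
The length and character figures for PromoInterval follow from its three distinct strings and their counts. The following sketch is plain Python written for illustration, not the profiler's own code. It uses the standard-library `unicodedata` module to recompute the total character count, the mean length, and the per-category totals; the short codes `Ll`, `Lu`, and `Po` are the Unicode general categories Lowercase Letter, Uppercase Letter, and Other Punctuation.

```python
import unicodedata
from collections import Counter

# Distinct PromoInterval values and their counts, copied from this report.
interval_counts = {
    "Jan,Apr,Jul,Oct": 242411,
    "Feb,May,Aug,Nov": 98005,
    "Mar,Jun,Sept,Dec": 80669,
}

n_values = sum(interval_counts.values())  # non-missing rows only
total_chars = sum(len(text) * count for text, count in interval_counts.items())
mean_length = total_chars / n_values

# Weight each character's general category by how often its string occurs.
category_totals = Counter()
for text, count in interval_counts.items():
    for ch in text:
        category_totals[unicodedata.category(ch)] += count

print(total_chars, round(mean_length, 6), dict(category_totals))
```

The mean length sits slightly above 15 only because "Sept" is spelled with four letters in the third interval.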

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jan,Apr,Jul,Oct
2nd row: Jan,Apr,Jul,Oct
3rd row: Jan,Apr,Jul,Oct
4th row: Jan,Apr,Jul,Oct
5th row: Feb,May,Aug,Nov

Common Values

Value             Count   Frequency (%)
Jan,Apr,Jul,Oct   242411  28.7%
Feb,May,Aug,Nov   98005   11.6%
Mar,Jun,Sept,Dec  80669   9.6%
(Missing)         423307  50.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value             Count   Frequency (%)
jan,apr,jul,oct   242411  57.6%
feb,may,aug,nov   98005   23.3%
mar,jun,sept,dec  80669   19.2%

Most occurring characters

Value              Count    Frequency (%)
,                  1263255  19.7%
J                  565491   8.8%
u                  421085   6.6%
a                  421085   6.6%
A                  340416   5.3%
c                  323080   5.1%
t                  323080   5.1%
r                  323080   5.1%
p                  323080   5.1%
n                  323080   5.1%
Other values (13)  1770212  27.7%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   3449349  53.9%
Uppercase Letter   1684340  26.3%
Other Punctuation  1263255  19.7%

Most frequent character per category

Lowercase Letter
Value             Count   Frequency (%)
u                 421085  12.2%
a                 421085  12.2%
c                 323080  9.4%
t                 323080  9.4%
r                 323080  9.4%
p                 323080  9.4%
n                 323080  9.4%
e                 259343  7.5%
l                 242411  7.0%
b                 98005   2.8%
Other values (4)  392020  11.4%

Uppercase Letter
Value  Count   Frequency (%)
J      565491  33.6%
A      340416  20.2%
O      242411  14.4%
M      178674  10.6%
F      98005   5.8%
N      98005   5.8%
S      80669   4.8%
D      80669   4.8%

Other Punctuation
Value  Count    Frequency (%)
,      1263255  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   5133689  80.3%
Common  1263255  19.7%

Most frequent character per script

Latin
Value              Count    Frequency (%)
J                  565491   11.0%
u                  421085   8.2%
a                  421085   8.2%
A                  340416   6.6%
c                  323080   6.3%
t                  323080   6.3%
r                  323080   6.3%
p                  323080   6.3%
n                  323080   6.3%
e                  259343   5.1%
Other values (12)  1510869  29.4%

Common
Value  Count    Frequency (%)
,      1263255  100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  6396944  100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
,                  1263255  19.7%
J                  565491   8.8%
u                  421085   6.6%
a                  421085   6.6%
A                  340416   5.3%
c                  323080   5.1%
t                  323080   5.1%
r                  323080   5.1%
p                  323080   5.1%
n                  323080   5.1%
Other values (13)  1770212  27.7%

Interactions

[Interaction plots (Matplotlib v3.7.2): images not included]
2023-10-16T17:18:28.105463image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-16T17:18:32.169218image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-16T17:18:35.965708image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-16T17:18:39.276858image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/

Correlations

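| | Store | DayOfWeek | Sales | Customers | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2SinceWeek | Promo2SinceYear | Promo | StateHoliday | SchoolHoliday | StoreType | Assortment | Promo2 | PromoInterval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Store | 1.000 | 0.000 | 0.001 | 0.031 | -0.046 | -0.052 | 0.003 | 0.007 | 0.034 | 0.000 | 0.007 | 0.000 | 0.098 | 0.115 | 0.072 | 0.160 |
| DayOfWeek | 0.000 | 1.000 | -0.179 | -0.146 | -0.000 | 0.000 | 0.001 | -0.001 | 0.003 | 0.414 | 0.026 | 0.204 | 0.168 | 0.149 | 0.029 | 0.007 |
| Sales | 0.001 | -0.179 | 1.000 | 0.832 | -0.035 | -0.042 | 0.055 | 0.097 | -0.034 | 0.370 | 0.055 | 0.038 | 0.112 | 0.094 | 0.119 | 0.051 |
| Customers | 0.031 | -0.146 | 0.832 | 1.000 | -0.257 | -0.032 | 0.054 | 0.038 | 0.045 | 0.208 | 0.067 | 0.023 | 0.328 | 0.272 | 0.207 | 0.048 |
| CompetitionDistance | -0.046 | -0.000 | -0.035 | -0.257 | 1.000 | -0.034 | -0.002 | -0.020 | -0.089 | 0.004 | 0.013 | 0.004 | 0.162 | 0.124 | 0.160 | 0.068 |
| CompetitionOpenSinceMonth | -0.052 | 0.000 | -0.042 | -0.032 | -0.034 | 1.000 | -0.124 | -0.039 | 0.045 | 0.000 | 0.010 | 0.000 | 0.120 | 0.105 | 0.156 | 0.192 |
| CompetitionOpenSinceYear | 0.003 | 0.001 | 0.055 | 0.054 | -0.002 | -0.124 | 1.000 | -0.001 | 0.102 | 0.000 | 0.003 | 0.001 | 0.065 | 0.112 | 0.084 | 0.130 |
| Promo2SinceWeek | 0.007 | -0.001 | 0.097 | 0.038 | -0.020 | -0.039 | -0.001 | 1.000 | -0.216 | 0.000 | 0.014 | 0.004 | 0.154 | 0.210 | 1.000 | 0.596 |
| Promo2SinceYear | 0.034 | 0.003 | -0.034 | 0.045 | -0.089 | 0.045 | 0.102 | -0.216 | 1.000 | 0.000 | 0.012 | 0.006 | 0.127 | 0.193 | 1.000 | 0.301 |
| Promo | 0.000 | 0.414 | 0.370 | 0.208 | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.011 | 0.029 | 0.018 | 0.013 | 0.000 | 0.000 |
| StateHoliday | 0.007 | 0.026 | 0.055 | 0.067 | 0.013 | 0.010 | 0.003 | 0.014 | 0.012 | 0.011 | 1.000 | 0.032 | 0.071 | 0.068 | 0.010 | 0.004 |
| SchoolHoliday | 0.000 | 0.204 | 0.038 | 0.023 | 0.004 | 0.000 | 0.001 | 0.004 | 0.006 | 0.029 | 0.032 | 1.000 | 0.005 | 0.004 | 0.008 | 0.000 |
| StoreType | 0.098 | 0.168 | 0.112 | 0.328 | 0.162 | 0.120 | 0.065 | 0.154 | 0.127 | 0.018 | 0.071 | 0.005 | 1.000 | 0.538 | 0.108 | 0.072 |
| Assortment | 0.115 | 0.149 | 0.094 | 0.272 | 0.124 | 0.105 | 0.112 | 0.210 | 0.193 | 0.013 | 0.068 | 0.004 | 0.538 | 1.000 | 0.016 | 0.086 |
| Promo2 | 0.072 | 0.029 | 0.119 | 0.207 | 0.160 | 0.156 | 0.084 | 1.000 | 1.000 | 0.000 | 0.010 | 0.008 | 0.108 | 0.016 | 1.000 | 1.000 |
| PromoInterval | 0.160 | 0.007 | 0.051 | 0.048 | 0.068 | 0.192 | 0.130 | 0.596 | 0.301 | 0.000 | 0.004 | 0.000 | 0.072 | 0.086 | 1.000 | 1.000 |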
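
As a reading aid, the sketch below filters a handful of entries copied from the correlation matrix above and lists the pairs whose absolute coefficient reaches a cutoff. The 0.5 cutoff is an assumption: the report does not state it, but it happens to reproduce the pairs named in the "High correlation" alerts. The names `CORRELATIONS`, `THRESHOLD`, and `high_correlation_pairs` are illustrative only.

```python
# Selected upper-triangle entries copied from the correlation matrix above.
CORRELATIONS = {
    ("Sales", "Customers"): 0.832,
    ("DayOfWeek", "Promo"): 0.414,
    ("Sales", "Promo"): 0.370,
    ("Customers", "CompetitionDistance"): -0.257,
    ("Promo2SinceWeek", "Promo2"): 1.000,
    ("Promo2SinceWeek", "PromoInterval"): 0.596,
    ("Promo2SinceYear", "Promo2"): 1.000,
    ("Promo2SinceYear", "PromoInterval"): 0.301,
    ("StoreType", "Assortment"): 0.538,
    ("Promo2", "PromoInterval"): 1.000,
}

# Assumed cutoff; the report itself does not state the value it used.
THRESHOLD = 0.5


def high_correlation_pairs(correlations, threshold=THRESHOLD):
    """Return (a, b, r) tuples with |r| >= threshold, strongest first."""
    flagged = [
        (a, b, r) for (a, b), r in correlations.items() if abs(r) >= threshold
    ]
    return sorted(flagged, key=lambda item: abs(item[2]), reverse=True)


if __name__ == "__main__":
    for a, b, r in high_correlation_pairs(CORRELATIONS):
        print(f"{a} ~ {b}: {r:+.3f}")
```

Under that assumed cutoff, pairs such as DayOfWeek and Promo (0.414) stay below the line, which matches their absence from the alerts.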

Missing values

A simple visualization of nullity by column.
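
Since the chart itself is not reproduced here, the per-column figures can be checked against the counts reported in the Overview and Alerts. The sketch below is plain arithmetic on those published numbers; the variable and function names are illustrative, and the final remainder is simply whatever the alerts do not attribute to a listed column.

```python
# Figures copied from the report's Overview and Alerts sections.
N_ROWS = 844_392
N_COLUMNS = 18
TOTAL_MISSING_CELLS = 1_809_345

MISSING_BY_COLUMN = {
    "CompetitionOpenSinceMonth": 268_619,
    "CompetitionOpenSinceYear": 268_619,
    "Promo2SinceWeek": 423_307,
    "Promo2SinceYear": 423_307,
    "PromoInterval": 423_307,
}


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Missing share per column, relative to the number of observations.
column_pct = {
    name: percent(count, N_ROWS) for name, count in MISSING_BY_COLUMN.items()
}

# Missing share over all cells (rows x columns).
overall_pct = percent(TOTAL_MISSING_CELLS, N_ROWS * N_COLUMNS)

# Missing cells not accounted for by the five columns named in the alerts.
unattributed = TOTAL_MISSING_CELLS - sum(MISSING_BY_COLUMN.values())

if __name__ == "__main__":
    for name, pct in column_pct.items():
        print(f"{name}: {pct}%")
    print(f"Overall: {overall_pct}%")
    print(f"Missing cells outside the alerted columns: {unattributed}")
```

The computed shares agree with the reported 31.8%, 50.1%, and 11.9%.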
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
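
Nullity correlation is commonly computed as the Pearson correlation between per-column "is missing" indicators. The sketch below assumes that definition (the report does not spell out its formula) and applies it to the ten rows at the end of the Sample section, using only the standard library. The list and function names are illustrative.

```python
from math import sqrt

# Missingness flags (True = NaN) for the last ten sample rows, in the order
# they are listed (Store 494, 512, 530, 562, 676, 682, 733, 769, 948, 1097).
COMPETITION_SINCE_MONTH_NULL = [
    False, True, True, True, False, False, False, True, True, False,
]
PROMO2_SINCE_WEEK_NULL = [
    True, False, True, True, True, True, True, False, True, True,
]
PROMO_INTERVAL_NULL = [
    True, False, True, True, True, True, True, False, True, True,
]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A column that is always (or never) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_x * var_y)


def nullity_correlation(a_null, b_null):
    """Correlate two missingness masks by treating True/False as 1/0."""
    return pearson([int(v) for v in a_null], [int(v) for v in b_null])


if __name__ == "__main__":
    same = nullity_correlation(PROMO2_SINCE_WEEK_NULL, PROMO_INTERVAL_NULL)
    mixed = nullity_correlation(COMPETITION_SINCE_MONTH_NULL, PROMO2_SINCE_WEEK_NULL)
    print(f"Promo2SinceWeek vs PromoInterval: {same:.2f}")
    print(f"CompetitionOpenSinceMonth vs Promo2SinceWeek: {mixed:.2f}")
```

Two columns that are always missing together score 1, as Promo2SinceWeek and PromoInterval do in these rows. Ten rows are far too few to say anything about the full dataset; the point is only to show what the heatmap's cells measure.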

Sample

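First rows

| | Store | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 5 | 2015-07-31 | 5263 | 555 | 1 | 1 | 0 | 1 | c | a | 1270.0 | 9.0 | 2008.0 | 0 | NaN | NaN | NaN |
| 1 | 2 | 5 | 2015-07-31 | 6064 | 625 | 1 | 1 | 0 | 1 | a | a | 570.0 | 11.0 | 2007.0 | 1 | 13.0 | 2010.0 | Jan,Apr,Jul,Oct |
| 2 | 3 | 5 | 2015-07-31 | 8314 | 821 | 1 | 1 | 0 | 1 | a | a | 14130.0 | 12.0 | 2006.0 | 1 | 14.0 | 2011.0 | Jan,Apr,Jul,Oct |
| 3 | 4 | 5 | 2015-07-31 | 13995 | 1498 | 1 | 1 | 0 | 1 | c | c | 620.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN |
| 4 | 5 | 5 | 2015-07-31 | 4822 | 559 | 1 | 1 | 0 | 1 | a | a | 29910.0 | 4.0 | 2015.0 | 0 | NaN | NaN | NaN |
| 5 | 6 | 5 | 2015-07-31 | 5651 | 589 | 1 | 1 | 0 | 1 | a | a | 310.0 | 12.0 | 2013.0 | 0 | NaN | NaN | NaN |
| 6 | 7 | 5 | 2015-07-31 | 15344 | 1414 | 1 | 1 | 0 | 1 | a | c | 24000.0 | 4.0 | 2013.0 | 0 | NaN | NaN | NaN |
| 7 | 8 | 5 | 2015-07-31 | 8492 | 833 | 1 | 1 | 0 | 1 | a | a | 7520.0 | 10.0 | 2014.0 | 0 | NaN | NaN | NaN |
| 8 | 9 | 5 | 2015-07-31 | 8565 | 687 | 1 | 1 | 0 | 1 | a | c | 2030.0 | 8.0 | 2000.0 | 0 | NaN | NaN | NaN |
| 9 | 10 | 5 | 2015-07-31 | 7185 | 681 | 1 | 1 | 0 | 1 | a | a | 3160.0 | 9.0 | 2009.0 | 0 | NaN | NaN | NaN |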
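
Last rows

| | Store | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1016588 | 494 | 2 | 2013-01-01 | 3113 | 527 | 1 | 0 | a | 1 | b | a | 1260.0 | 6.0 | 2011.0 | 0 | NaN | NaN | NaN |
| 1016606 | 512 | 2 | 2013-01-01 | 2646 | 625 | 1 | 0 | a | 1 | b | b | 590.0 | NaN | NaN | 1 | 5.0 | 2013.0 | Mar,Jun,Sept,Dec |
| 1016624 | 530 | 2 | 2013-01-01 | 2907 | 532 | 1 | 0 | a | 1 | a | c | 18160.0 | NaN | NaN | 0 | NaN | NaN | NaN |
| 1016656 | 562 | 2 | 2013-01-01 | 8498 | 1675 | 1 | 0 | a | 1 | b | c | 1210.0 | NaN | NaN | 0 | NaN | NaN | NaN |
| 1016770 | 676 | 2 | 2013-01-01 | 3821 | 777 | 1 | 0 | a | 1 | b | b | 1410.0 | 9.0 | 2008.0 | 0 | NaN | NaN | NaN |
| 1016776 | 682 | 2 | 2013-01-01 | 3375 | 566 | 1 | 0 | a | 1 | b | a | 150.0 | 9.0 | 2006.0 | 0 | NaN | NaN | NaN |
| 1016827 | 733 | 2 | 2013-01-01 | 10765 | 2377 | 1 | 0 | a | 1 | b | b | 860.0 | 10.0 | 1999.0 | 0 | NaN | NaN | NaN |
| 1016863 | 769 | 2 | 2013-01-01 | 5035 | 1248 | 1 | 0 | a | 1 | b | b | 840.0 | NaN | NaN | 1 | 48.0 | 2012.0 | Jan,Apr,Jul,Oct |
| 1017042 | 948 | 2 | 2013-01-01 | 4491 | 1039 | 1 | 0 | a | 1 | b | b | 1430.0 | NaN | NaN | 0 | NaN | NaN | NaN |
| 1017190 | 1097 | 2 | 2013-01-01 | 5961 | 1405 | 1 | 0 | a | 1 | b | b | 720.0 | 3.0 | 2002.0 | 0 | NaN | NaN | NaN |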